Systems in which entities interact with one another are common. In many interacting systems, it is difficult to observe the relations between entities, even though they are key information for analyzing the system. In recent years, there has been growing interest in discovering the relations between entities using graph neural networks. However, existing methods are difficult to apply when the number of relations is unknown or when the relations are complex. We propose the Discovering Latent Relations (DSLR) model, which is flexibly applicable even if the number of relations is unknown or many types of relations exist. The flexibility of our DSLR model comes from the design concept of our encoder, which represents the relations between entities in a latent space rather than as discrete variables, and of our decoder, which can handle many types of relations. We conduct experiments on synthetic and real-world graph data with various relations among entities, and compare the qualitative and quantitative results with those of other approaches. The experiments show that the proposed method is suitable for analyzing dynamic graphs with an unknown number of complex relations.
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%), and 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based; of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
Image super-resolution is a common task on mobile and IoT devices, where one often needs to upscale and enhance low-resolution images and video frames. While numerous solutions have been proposed for this problem in the past, they are usually not compatible with low-power mobile NPUs having many computational and memory constraints. In this Mobile AI challenge, we address this problem and ask the participants to design an efficient quantized image super-resolution solution that can demonstrate real-time performance on mobile NPUs. The participants were provided with the DIV2K dataset and trained INT8 models to perform high-quality 3X image upscaling. The runtime of all models was evaluated on the Synaptics VS680 Smart Home board with a dedicated edge NPU capable of accelerating quantized neural networks. All proposed solutions are fully compatible with the above NPU, demonstrating a rate of up to 60 FPS when reconstructing Full HD resolution images. A detailed description of all models developed in the challenge is provided in this paper.
Subgraphs are rich substructures in graphs, and their nodes and edges may be only partially observed in real-world tasks. Under partial observation, existing node-level or subgraph-level message passing produces suboptimal representations. In this paper, we formulate a novel task of learning representations of subgraphs under partial observation. To solve this problem, we propose the Partial Subgraph InfoMax (PSI) framework and generalize existing InfoMax models, including DGI, InfoGraph, MVGRL, and GraphCL, into our framework. These models maximize the mutual information between the summary of a partial subgraph and various substructures, ranging from individual nodes to full subgraphs. In addition, we suggest a novel two-stage model with $k$-hop PSI, which reconstructs the representation of the full subgraph and improves its expressiveness using different local-global structures. Under training and evaluation protocols designed for this problem, we conduct experiments on three real-world datasets and demonstrate that PSI models outperform the baselines.
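To make the $k$-hop notion above concrete, here is a minimal sketch of extracting the nodes within $k$ hops of a set of observed seed nodes; the function name and adjacency-dict representation are our own illustrative choices, not the paper's code.

```python
# Illustrative sketch: nodes reachable within k hops of observed seed nodes.
# A partial subgraph around the seeds can then be induced from this node set.
from collections import deque

def k_hop_nodes(adj, seeds, k):
    """BFS from the seeds, stopping at distance k; adj is {node: [neighbors]}."""
    dist = {s: 0 for s in seeds}
    q = deque(seeds)
    while q:
        u = q.popleft()
        if dist[u] == k:          # do not expand past the k-hop frontier
            continue
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return set(dist)

# A path graph 0-1-2-3; observing node 0 with k=2 recovers {0, 1, 2}.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
partial = k_hop_nodes(adj, seeds=[0], k=2)
```

In the actual framework, representations of such partial substructures and the full subgraph summary would be fed to a mutual-information objective; this snippet only illustrates the neighborhood extraction.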
This paper presents the Persistent Weisfeiler-Lehman Random walk scheme (abbreviated as PWLR) for graph representations, a novel mathematical framework that generates interpretable low-dimensional representations of graphs with discrete and continuous node features. The proposed scheme effectively incorporates normalized Weisfeiler-Lehman procedures, random walks on graphs, and persistent homology. We thereby integrate three distinct properties of graphs, namely local topological features, node degrees, and global topological invariants, while preserving stability under graph perturbations. This generalizes many variants of Weisfeiler-Lehman procedures, which are primarily used to embed graphs with discrete node labels. Empirical results suggest that these representations can be efficiently utilized to produce results comparable to state-of-the-art techniques in classifying graphs with discrete node labels, and enhanced performance in classifying graphs with continuous node features.
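For readers unfamiliar with the Weisfeiler-Lehman procedures the scheme builds on, a toy sketch of one WL relabeling step follows; the compression-to-integers detail is our own illustrative choice, and the paper's normalized variant and persistent-homology layer are not reproduced here.

```python
# Toy sketch of one Weisfeiler-Lehman relabeling step: each node's new label
# is a canonical encoding of its own label plus its sorted neighbor labels.
def wl_step(adj, labels):
    """One WL iteration over an adjacency dict {node: [neighbors]}."""
    new = {}
    for u, l in labels.items():
        sig = (l, tuple(sorted(labels[v] for v in adj.get(u, ()))))
        new[u] = sig
    # Compress signatures to small integers for readability.
    ids = {s: i for i, s in enumerate(sorted(set(new.values())))}
    return {u: ids[s] for u, s in new.items()}

adj = {0: [1], 1: [0, 2], 2: [1]}   # path graph 0-1-2
labels = {0: 0, 1: 0, 2: 0}         # all nodes start identical
labels1 = wl_step(adj, labels)      # endpoints and middle become distinct
```

Iterating such steps refines node labels by neighborhood structure; PWLR further combines this with random walks and persistence to obtain the stability properties described above.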
Multilingual neural machine translation has shown great success using Transformer models. Deploying these models is challenging, however, because they usually require a large vocabulary size across the various languages. This limits the speed of predicting output tokens in the final vocabulary projection layer. To alleviate these challenges, this paper proposes a fast vocabulary projection method via clustering, which can be used for multilingual Transformers on GPUs. First, we offline split the vocabulary search space into disjoint clusters given the hidden context vector of the decoder output, which results in a much smaller vocabulary column for the vocabulary projection. Second, at inference time, the proposed method predicts the cluster and the candidate tokens for the vocabulary projection given the hidden context vector. This paper also includes an analysis of different ways of building these clusters in multilingual settings. Our results show end-to-end speed gains in float16 GPU inference of up to 25% while maintaining the BLEU score and with only a slight increase in memory cost. The proposed method speeds up the vocabulary projection step itself by up to 2.6x. We also conduct an extensive human evaluation to verify that the proposed method preserves the translation quality of the original model.
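The two-step idea can be sketched in a few lines: score cheap cluster centroids first, then run the full projection only over the chosen cluster's tokens. The 2-cluster toy vocabulary, function names, and the centroid-argmax cluster predictor below are illustrative assumptions, not the paper's implementation.

```python
# Hedged sketch of clustering-based vocabulary projection.
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cluster_vocab_projection(hidden, centroids, clusters, emb):
    """Score only tokens in the cluster whose centroid best matches `hidden`.

    centroids: one vector per cluster (built offline from decoder states).
    clusters:  disjoint lists of token ids, one list per cluster.
    emb:       full output-embedding matrix, one row per token.
    """
    # Step 1: pick the best cluster with a cheap centroid scan.
    best = max(range(len(centroids)), key=lambda c: dot(hidden, centroids[c]))
    # Step 2: full projection restricted to that cluster's tokens.
    scores = {t: dot(hidden, emb[t]) for t in clusters[best]}
    return max(scores, key=scores.get)

# Toy vocabulary of 4 tokens split into 2 clusters.
emb = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
clusters = [[0, 1], [2, 3]]
centroids = [[0.95, 0.05], [0.05, 0.95]]
best_token = cluster_vocab_projection([0.1, 1.0], centroids, clusters, emb)  # token 2
```

The speedup comes from replacing a full |V|-wide matrix product with a small centroid scan plus a projection over one cluster's tokens.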
Image translation based on generative adversarial networks (GAN-IT) is a promising method for the precise localization of abnormal regions in chest X-ray images (AL-CXR). However, heterogeneous unpaired datasets undermine existing methods' ability to extract key features and to distinguish normal from abnormal cases, resulting in inaccurate and unstable AL-CXR. To address this problem, we propose an improved two-stage GAN-IT involving registration and data augmentation. In the first stage, we introduce an invertible learning-based registration technique that virtually and reasonably converts unpaired data into paired data for learning registration maps. This novel approach achieves high registration performance. In the second stage, we apply data augmentation that diversifies anomaly locations by swapping the left and right lung regions on the uniformly registered frames, further improving performance by alleviating the imbalance in the data distribution of left and right lung lesions. Our method is designed to be applied to existing GAN-IT models, allowing existing architectures to benefit from the key features needed for translation. By demonstrating that AL-CXR performance is uniformly improved when the proposed method is applied, we believe that GAN-IT for AL-CXR can be deployed in clinical environments even when learning data are scarce.
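The second-stage augmentation can be illustrated with a toy sketch: on registered frames, swapping the left and right halves of the image moves lesions to the opposite lung. The nested-list image representation and the assumption that the midline splits the lungs exactly are simplifications of ours, not the paper's pipeline.

```python
# Illustrative sketch of left-right lung-region swapping on a registered frame.
def swap_lung_halves(img):
    """Swap the left and right halves of each row (width assumed even)."""
    half = len(img[0]) // 2
    return [row[half:] + row[:half] for row in img]

# A tiny 2x4 "image"; values stand in for registered CXR pixels.
img = [[1, 2, 3, 4],
       [5, 6, 7, 8]]
swapped = swap_lung_halves(img)  # [[3, 4, 1, 2], [7, 8, 5, 6]]
```

Because registration aligns the frames first, such a swap yields anatomically plausible samples with lesions on the opposite side, balancing the left/right lesion distribution.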
We present a new Transformer model for the task of unsupervised learning of skeleton motion sequences. Existing Transformer models for unsupervised skeleton-based action learning are trained on the instantaneous velocity of each joint from adjacent frames, without global motion information. As a result, the model has difficulty learning attention over whole-body motion and over temporally distant joints. In addition, person-to-person interactions have not been considered in these models. To handle the learning of whole-body motion, long-range temporal dynamics, and person-to-person interactions, we design a global and local attention mechanism in which global body motions and local joint motions attend to each other. In addition, we propose a novel pretraining strategy, multi-interval pose displacement prediction, to learn global and local attention at diverse time ranges. The proposed model successfully learns the local dynamics of joints and captures global context from motion sequences. Our model outperforms state-of-the-art models by notable margins on representative benchmarks. Code is available at https://github.com/boeun-kim/gl-transformer.
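The pretraining targets of multi-interval pose displacement prediction can be sketched as joint displacements computed over several time intervals; the interval set and data layout below are our illustrative assumptions, and the paper's exact pretext-task formulation may differ.

```python
# Hedged sketch: per-joint displacement targets over multiple time intervals.
def displacement_targets(seq, intervals):
    """seq: list of poses, each a list of (x, y) joint coordinates.
    Returns {interval k: per-frame joint displacements from frame t to t+k}."""
    targets = {}
    for k in intervals:
        per_frame = []
        for t in range(len(seq) - k):
            per_frame.append([
                (x2 - x1, y2 - y1)
                for (x1, y1), (x2, y2) in zip(seq[t], seq[t + k])
            ])
        targets[k] = per_frame
    return targets

# A 4-frame sequence of a single joint moving right by 1 unit per frame.
seq = [[(t, 0.0)] for t in range(4)]
targets = displacement_targets(seq, intervals=[1, 2])
```

Short intervals expose local joint dynamics while long intervals expose global motion, which is the intuition behind learning attention at diverse time ranges.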
Sparsely activated transformers, such as Mixture of Experts (MoE), have received great interest due to their extreme scaling capability, which enables dramatic increases in model size without significant increases in computational cost. To achieve this, MoE models replace the feedforward sub-layer in the Transformer with Mixture-of-Experts sub-layers and use a gating network to route each token to its assigned experts. Since the common practice for efficient training of such models requires distributing experts and tokens across different machines, this routing strategy often incurs huge cross-machine communication cost, because tokens and their assigned experts likely reside on different machines. In this paper, we propose \emph{Gating Dropout}, which allows tokens to ignore the gating network and stay on their local machines, thus reducing cross-machine communication. Similar to traditional dropout, we also show that Gating Dropout has a regularization effect during training, resulting in improved generalization performance. We validate the effectiveness of Gating Dropout on multilingual machine translation tasks. Our results demonstrate that Gating Dropout improves a state-of-the-art MoE model, with faster wall-clock convergence and better BLEU scores across a variety of model sizes and datasets.
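The routing idea can be sketched in a few lines: with probability p, a token bypasses the gating network and is handled by the expert on its local machine. The function names and the toy modulo gate below are our own illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of gating-dropout-style routing for MoE tokens.
import random

def route_tokens(tokens, local_expert, gate, p, rng):
    """Return (token, expert) assignments under gating dropout with rate p."""
    assignments = []
    for tok in tokens:
        if rng.random() < p:            # dropout: stay local, no communication
            assignments.append((tok, local_expert))
        else:                           # normal routing via the gating network
            assignments.append((tok, gate(tok)))
    return assignments

rng = random.Random(0)
gate = lambda tok: tok % 4              # toy gate over 4 experts
out = route_tokens(range(8), local_expert=1, gate=gate, p=0.5, rng=rng)
```

Tokens that "drop" the gate never leave their machine, which is where the communication savings come from; the stochastic bypass also acts as a regularizer, as the abstract notes.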
The Annals of Joseon Dynasty (AJD) contain the daily records of the Kings of Joseon, the 500-year kingdom preceding the modern nation of Korea. The Annals were originally written in an archaic Korean writing system, `Hanja', and were translated into Korean from 1968 to 1993. The resulting translation was however too literal and contained many archaic Korean words; thus, a new expert translation effort began in 2012. Since then, the records of only one king have been completed in a decade. In parallel, expert translators are working on an English translation, also at a slow pace, and have produced only one king's records in English so far. We therefore propose H2KE, a neural machine translation model that translates historical documents in Hanja into more easily understandable Korean and into English. Built on top of multilingual neural machine translation, H2KE learns to translate a historical document written in Hanja from both a full dataset of outdated Korean translations and a small dataset of more recently translated contemporary Korean and English. We compare our method against two baselines: a recent model that simultaneously learns to restore and translate Hanja historical documents, and a Transformer-based model trained only on the newly translated corpora. The experiments reveal that our method significantly outperforms the baselines in terms of BLEU scores for both contemporary Korean and English translations. We further conduct an extensive human evaluation, which shows that our translations are preferred over the original expert translations by both experts and non-expert Korean speakers.